A heuristic block coordinate descent approach for controlled tabular adjustment

Authors

  • José A. González
  • Jordi Castro
Abstract

One of the main concerns of national statistical agencies (NSAs) is to publish tabular data. NSAs have to guarantee that no private information from specific respondents can be disclosed from the released tables. The purpose of the statistical disclosure control field is to avoid such a leak of private information. Most protection techniques for tabular data rely on the formulation of a large mathematical programming problem, whose solution is computationally expensive even for tables of moderate size. One of the emerging techniques in this field is controlled tabular adjustment (CTA). Although CTA is more efficient than other protection methods, the resulting mixed integer linear problems (MILP) are still challenging. In this work, a heuristic approach based on block coordinate descent decomposition is designed and applied to large hierarchical and general CTA instances. This approach is compared with CPLEX, a state-of-the-art MILP solver. Our results, from both synthetic and real tables with up to 1,200,000 cells, 100,000 of them being sensitive (resulting in MILP instances of up to 2,400,000 continuous variables, 100,000 binary variables, and 475,000 constraints), show that the heuristic block coordinate descent has a better practical behaviour than a state-of-the-art solver: for large hierarchical instances it provides significantly better solutions within a specified realistic time limit, as required by NSAs in real-world settings.
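
For orientation, the MILP mentioned in the abstract can be sketched with the minimum-distance CTA formulation commonly found in the CTA literature. This is an illustrative sketch, not necessarily the exact model used in the paper, and all symbols are assumptions introduced here: a_i are the original cell values, S is the set of sensitive cells, A is the matrix of table relations, w_i are cell weights, l_i and u_i are cell bounds, lpl_i and upl_i are lower and upper protection levels, z_i^+ and z_i^- are upward and downward deviations, and y_i is the binary direction of protection for a sensitive cell.

```latex
\begin{aligned}
\min_{z^+,\,z^-,\,y}\quad & \sum_{i=1}^{n} w_i \,\bigl(z_i^+ + z_i^-\bigr) \\
\text{s.t.}\quad & A\,(z^+ - z^-) = 0, \\
& 0 \le z_i^+ \le u_i - a_i, \quad 0 \le z_i^- \le a_i - l_i, && i \notin S, \\
& upl_i \, y_i \le z_i^+ \le (u_i - a_i)\, y_i, && i \in S, \\
& lpl_i \,(1 - y_i) \le z_i^- \le (a_i - l_i)\,(1 - y_i), && i \in S, \\
& y_i \in \{0, 1\}, && i \in S.
\end{aligned}
```

Under this sketch a table with n cells yields 2n continuous variables and |S| binary variables, which agrees with the figures quoted above (1,200,000 cells and 100,000 sensitive cells giving 2,400,000 continuous and 100,000 binary variables). A block coordinate descent heuristic, in the general sense of the term, would fix the y_i outside a chosen block of sensitive cells, solve the smaller MILP over the remaining variables, and repeat over successive blocks.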

Similar resources

Fix-and-relax approaches for controlled tabular adjustment

Controlled tabular adjustment (CTA) is a relatively new protection technique for tabular data protection. CTA formulates a mixed integer linear programming problem, which is challenging for tables of moderate size. Even finding a feasible initial solution may be a challenging task for large instances. On the other hand, end users of tabular data protection techniques give priority to fast execu...

Using BCD-CTA for difficult tables: a practical experiment with a real Eurostat table

CTA is a post-tabular perturbative approach for statistical disclosure control. Its purpose is to compute the closest safe table to the original data, using some distance. Sensitive cells are adjusted either upwards or downwards (binary decision), and the resulting cells have to be accordingly (and minimally) modified to preserve marginals. For real and large tables, CTA may result in a difficu...

Exact, Heuristic and Metaheuristic Methods for Confidentiality Protection by Controlled Tabular Adjustment

Government agencies and commercial organizations that report data face the task of representing the data meaningfully while simultaneously protecting the confidentiality of critical data components. The challenge is to organize and disseminate data in a form that prevents these components from being unmasked by corporate espionage, or falling prey to efforts to penetrate the security of the inf...

Present and future research on controlled tabular adjustment

Controlled tabular adjustment (CTA) can be classified within the group of approaches that perturb output data (i.e., tabular data), unlike other methods that focus on the original microdata. Being a post-tabular data perturbation technique it becomes easier to guarantee consistency and quality of the released information (e.g., table additivity, preservation of subtotal or total cells of the or...

Testing variants of minimum distance controlled tabular adjustment

Controlled tabular adjustment (CTA), and its minimum distance variants, is a recent methodology for the protection of tabular data. Given a table to be protected, the purpose of the method is to find the closest one that guarantees the confidentiality of the sensitive cells. This is achieved by adding slight adjustments to the remaining cells, preferably excluding total ones, whose values are...

Journal:
  • Computers & OR

Volume 38  Issue

Pages  -

Publication date 2011